ABSTRACT

The feasibility of employing survey data in macro-level
sociological research that uses the states of the United States as
the societal units was studied. The range of issues open to
investigation would be greatly expanded if it were
possible to aggregate national survey data to produce state-level
statistics for variables such as approval of violence and gender role
attitudes. If such data were available, they could be used to
investigate issues such as whether state-to-state differences in the
degree of approval of violence explain part of the huge differences
among states in the rate of violent crimes. The validity analyses
reported in this paper were conducted using data from the 1975
National Family Violence Survey, the 1985 National Family Violence
Resurvey, and the 1972-84 cumulative General Social Survey. Analyses
provide information on concurrent validity, construct validity,
validity of specific variables, and multi-indicator indexes. It is
concluded that aggregate survey data should be avoided unless there
are strong reasons to use such data despite the problems. (TJH)
OBSTACLES TO USE OF NATIONAL SURVEY DATA

The idea of employing the vast storehouse of national
survey data which has accumulated since World War II is attractive, but the
reality is much more limited.

Inadequate State Sample

»-,~- 1 , , ^i>j.^ ui^e cnac IS ai. least double the c-i^rp r>f t-Vio 

typical national survey. Moreover evpn ^n 1o>-,tp xt ^ne size ot the 

=t.te= represented by a', few T Tr "thSe ^fp^o'd/n^r'^ ^""^ ""^ 

Adding to the problem of sample size is the fact that there is no
clear criterion for determining the minimum number of cases. It depends on
the use being made of the data. If, for examnle th^ r^.^^'^'^l °" 

c=) rnrS:""-^ K^T^ -^i- e~;^Ya/"dete™S 

state-by-state descriptive statistics. While there may be exceptions,
such as presenting such data for the ten largest states, state-level
statistics produced from individual-level surveys should be used only to
investigate relationships between variables by techniques such as

sStL'ti"ron"the"^""'°"'/"'' used' toTresent descrfp-.iv 

suanistics on the percent of people in a eiven state nr- c^ot-^.. u u 7^ 

certain attitude, or who have certain characteristics.

shoulffttuf reporting results based on data aggregated from surveys 

Agression etc^r'irr' °' -l^i^^i- (cross-tabs. ANOVA. correlation 
regression, etc.). It is generally best to avoid reporting the score for a
particular state because (for the reasons given above) there is a large
risk of error connected with any one data point.
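
To give a rough sense of the risk attached to a single state's score, the sketch below computes the standard error of a percentage estimated from n respondents. It assumes simple random sampling within the state, which the clustered designs of most national surveys do not satisfy, so the true error is probably larger. The function name and the example values are illustrative and are not taken from the surveys analyzed in this paper.

```python
from math import sqrt

def se_percent(p, n):
    """Standard error, in percentage points, of a proportion p estimated
    from n respondents, assuming simple random sampling (an assumption)."""
    return 100.0 * sqrt(p * (1.0 - p) / n)

# Hypothetical state samples of different sizes, all with p = 0.5.
for n in (20, 100, 500):
    print(n, round(se_percent(0.5, n), 1))
```

Under these assumptions a state represented by 20 respondents has a standard error of roughly 11 percentage points, against 5 points for a state represented by 100, which is one way to see why individual state scores are best not reported.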



Limited Number of States 



One method of overcoming the problem of states represented by few
c>uj.vcy IS extremely large, this results in a drastic -edl!r^^nr. ir. fU 

rs'^oo' T^'^^"'^ ^^^^y-- exf^p e a s'^ey with an « 

of 5,000 might include respondents from 40 states. But even if one sets

to^IrT tTal^stfteT^^ - - ^-P^^ li^^ly 

Sample Design Not Usually Appropriate

level^'^^nnrw" 'T^^, ^^^^^ ^"°"Sh to aggregate to the state- 

deli.;.n K ^ ^^"^y" P"^^"^- national surveys are 

representative of each region, but not necessarily of the
states within each region. For example, as the first step in the sample

Howe^; 1' at: ^^^^ ^^ht be randomly sa^'^d 

Ro^^ ; i ^ °^ counties might be selected 

Both of those could be rural counties, or both highly urban counties.
Thus it is possible to have a "sample" of Massachusetts in which the
Boston metropolitan area is not included at all.

The obstacles listed above are formidable. On the other hand, there
may be enough that is promising to encourage the more extensive methodological
analysis presented in this paper.



METHOD 

Data 



The validity analyses reported in this paper were carried out using
the 1975 National Family Violence Survey
(Straus and Gelles, 1980; Straus, Gelles and Steinmetz, 1980), the 1985
National Family Violence Resurvey (Straus and Gelles, 1986), and the 1972-84
cumulative General Social Survey. All three of these
surveys use relatively large size samples, and each has been the basis

fn thHecT? P^^^.^-^^r v. ^"^^y "^^^ -^---^bed in more detaU 

in the sections where the findings from that survey are presented. 

State-level variables were created by computing the percentage of
respondents in each state who expressed a certain opinion or who reported
a certain behavior or socioeconomic characteristic. An example of using
individual attitude data to create a measure of social norms for each
state is the percentage of respondents in Alabama, Alaska, Arkansas, etc.

tlJr T "'I'^u P'""^'y- ^'^^P^^^ °f individual beh;vior 

measures to create behavioral structure measures for each state include
the percent of the population of each state who own a handgun, who slapped
their spouse during the previous 12 months, or who drank more than a
certain number of ounces of liquor during the week of the survey.
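
The aggregation step described above can be sketched in a few lines. This is a minimal, unweighted illustration rather than the procedure actually used for the surveys in this paper: the record layout, the item name `owns_handgun`, and the `min_n` cutoff are all hypothetical, and a weighted sample would need each respondent's weight in place of the simple counts.

```python
from collections import defaultdict

def state_percentages(respondents, item, min_n=1):
    """Return {state: (percent answering yes, n)} for one yes/no item.

    respondents is a list of dicts with a "state" key; the layout is a
    hypothetical one chosen for this sketch. Respondents with no answer
    (None) on the item are skipped."""
    counts = defaultdict(lambda: [0, 0])  # state -> [yes count, valid n]
    for r in respondents:
        value = r.get(item)
        if value is None:
            continue
        tally = counts[r["state"]]
        tally[0] += 1 if value else 0
        tally[1] += 1
    return {
        state: (100.0 * yes / n, n)
        for state, (yes, n) in counts.items()
        if n >= min_n  # drop states with too few valid respondents
    }

# Hypothetical individual-level records, not real survey data.
respondents = [
    {"state": "AL", "owns_handgun": True},
    {"state": "AL", "owns_handgun": False},
    {"state": "AL", "owns_handgun": True},
    {"state": "AK", "owns_handgun": False},
    {"state": "AK", "owns_handgun": True},
    {"state": "AR", "owns_handgun": None},
]
result = state_percentages(respondents, "owns_handgun")
print(result)
```

Returning the number of valid respondents next to each percentage keeps the information needed later to restrict an analysis to states above a chosen minimum N.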



Concurrent and Construct Validity Analyses

correlated with the same variable as eiven in ti c ^ 

each state with the median income as reported in the census. The



Ambiguity In Criteria For Validity



to 77 ^ith " ' coefficients shows that they range from .08 

^alidiVv • °^ Cronbach comments "It is unusual for a 
validity coefficient to rise above 0.60."

is e^n forrof °^ established standards for judging concurrent validity 
PC- r^ly ^^ser^^" ^ ' JL^^^V^^^^^. 

s^^ptn^fd t:: imSr -sjr ^-Tirz£^ 

representative, the measures used in studying that sample are valid.*1

deriv^rcirte:fa;ittrd%cfded^r^^^^^^^^^ 

Sh^f^abt^ 3.r^ - Leff£?ent1^f^^?7 de^rtid^S:: 

Construct validity refers to the extent to which
the relationship of a measure to other
variables follows a pattern that is consistent with theoretical or
empirical knowledge (Cronbach, 1970; Nunnally, 1978; Straus, 1964). For example,
a measure of caloric intake should be correlated with feeling hungry
because the subjective experience of hunger is caused by
lack of food intake. Of course, the correlation will be less than 1.00
because there are other factors which also influence subjective feelings
of hunger. There is more ambiguity as to the size of the coefficient
which will be taken as evidence of construct validity than there is for
concurrent validity. This is inherent in the process. If the theory
being tested specifies a close linkage between the
independent and dependent variable, then a large correlation is needed;
but if (as in most theories) only a weak bivariate relation is posited
because of the numerous other factors which are involved, then low
correlations, provided they are statistically significant, support the
construct validity of the measures used to test the theory.
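
The condition "provided they are statistically significant" can be made concrete with the usual t statistic for testing whether a Pearson correlation differs from zero, t = r * sqrt(n - 2) / sqrt(1 - r^2), where n is the number of states. The sketch below is only an illustration of that standard formula; the function name and the example values of r and n are hypothetical and are not results from this paper.

```python
from math import sqrt

def t_for_correlation(r, n):
    """t statistic (n - 2 degrees of freedom) for the null hypothesis
    that the population correlation is zero, given a sample r over n units."""
    if n < 3 or abs(r) >= 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)

# Hypothetical example: a correlation of .40 computed over 36 states.
t = t_for_correlation(0.40, 36)
print(round(t, 2))
```

The resulting value is compared with a tabled critical value of t for n - 2 degrees of freedom. Because n here is the number of states rather than the number of respondents, even a very large survey yields a test based on at most about 50 units.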

THE 1975 NATIONAL FAMILY VIOLENCE SURVEY 

This exploration of the possibility of using individual-level
survey data for state-level macro-sociological research grew out of the
importance of testing the theory that "wife-beating" is one of many
mechanisms which serve to keep women subservient to men
(Yllo, 1983a, b; Yllo and Straus, 1984). Since wife-beating rates were not
available for states, it was
decided to create estimated rates for each of the states included in the
National Family Violence Survey (NFVS).

Sample 

of 9 ^7?/^ ^'^"'iy of a nationally representative sample 

^fv • 107." 4. "'^'^''^^^ °^ ^^^^"g "i'^^ ^ P^^'^"^^ °f the opposite 

sex in 1975. The survey included respondents in 36 states. The number of

than' ^""^ ' '° "^^^^ « representeHy less 

than 20 cases per state. The data tape is available from the
Inter-university Consortium for Political and Social Research (Straus and
Gelles, 1980, ICPSR study 7733).

Concurrent Validity 

Concurrent validity was investigated by computing the correlation 
between five state-level variables created by aggregating the survey data 
with five census variables which measure approximately the same 
characteristic. These correlations were computed for the entire set of 36
states, and then replicated after deleting the six states represented by
less than 20 respondents.
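
The concurrent validity computation just described amounts to a Pearson correlation across states between the survey-based estimate and the census figure, repeated after dropping states below a minimum N. The sketch below shows one way this could be written; the data structures, the function names, and every number in the example are hypothetical and are not the values used in this paper.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def concurrent_validity(survey, census, min_n=0):
    """Correlate survey estimates with census values across states.

    survey maps state -> (estimate, respondents); census maps state -> value.
    Only states with at least min_n respondents are used.
    Returns (r, number of states used)."""
    states = sorted(s for s, (_, n) in survey.items()
                    if n >= min_n and s in census)
    xs = [survey[s][0] for s in states]
    ys = [census[s] for s in states]
    return pearson_r(xs, ys), len(states)

# Hypothetical state estimates: (percent, respondents) and census percents.
survey = {"A": (50.0, 120), "B": (60.0, 80), "C": (70.0, 15),
          "D": (55.0, 40), "E": (65.0, 10)}
census = {"A": 52.0, "B": 61.0, "C": 58.0, "D": 54.0, "E": 75.0}

r_all, k_all = concurrent_validity(survey, census)
r_20, k_20 = concurrent_validity(survey, census, min_n=20)
print(k_all, round(r_all, 2), k_20, round(r_20, 2))
```

Running the function twice, once with all states and once with a cutoff, mirrors the replication after deleting the smallest states. The number of states used is returned because raising the cutoff also shrinks the set of units on which the correlation rests.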

(Table 1 about here) 

The correlations between the survey data estimates and the census
data using all 36 states ranged from .13 to .68, with a mean of .46 (Table
1). The correlations using the 30 states with N's of at least 20 cases
ranged from .24 to .77, with a mean of .58. Thus, even when
the state-level variables include states with fewer than 20
respondents, the correlations exceed the average reported in psychometric
validity studies, as summarized in the preceding section of this paper.
sn J / tiuoui- xuu cases, „i. which were states in the 90 t-r,

50 respondent range. In addition, the census variables use a different
base population than the survey data, i.e., all adult males (census) versus
married males living with spouse (survey).

en^erationdatawou!dno?pr^L":e\^p:?f7cV^ ^^^^^^ 



Construct Validity



Strauf°ran:^/s:r (YUo i™^ nL^''?9^'i: V ."'^ 

re'fSrar"' SslL'^witT hv^^tr '^^^^ this trv^y" T^f 

the. fore provide "arTeas\"L.^~^^^^ ^t^^Z^i^^ ^^.^ 
the incidence rates based on aggregating survey data.

THE 1985 NATIONAL FAMILY VIOLENCE RESURVEY 



Sample 
Interviewing for the National Family Violence Resurvey (NFVR)

:= =T„k ts"Lr rs?^i'H^3- s,- 

Columbia made it possibTp i ^^^^ ^''^ District o:: 

^ iiujiiucj. oi: cases per state ranees from 7 ^n S7n qn.4« . 
more systematic invesuieation nf -ll I I 1 ""^"^^ permits a 

variables computed from this survey Is ereator tttn 51 ,? , } ! 

::re".o':Sertii'" »srt^otu^ot ^^^Aht^ofihf h 

were designed to be representative of specific states. 

Effect Of State N On Concurrent Validity

(Table 2 about here) 

The columns of Table 2 headed Minimum Number of Cases Per State
provide data on the extent to which validity is related to the
number of respondents per state.

Contrary to the hypothesis, the correlations in the first column
(which uses only states represented by at least 100 respondents) are not
much greater than those in the second column (which uses states with
more than 25 respondents) or the third column (which includes all 50
states and the District of Columbia, regardless of the number of
respondents in each state). The reasons for these unexpected results are
not clear. Perhaps there is a trade-off between gains from a more
adequate sample size and losses from a more restricted number of states.

Comparison of the average validity coefficients in the bottom row of
Table 2 with the average in the bottom row of Table 1 shows that the 1985
coefficients are over 50% higher than the coefficients based on the 1975
survey. The greater validity of state-level estimates from the 1985
study is probably a result of the combined effect of the much larger
sample studied in 1985 and the fact that this sample was selected to be
representative of each state in the study.



Construct Validity

(Table 3 about here) 

Table 3 relates 12 state-level variables created by aggregating the
National Family Violence Resurvey to state-level variables based on
public record data such as the Vital Statistics of the
United States. Unlike the correlations in Table 2, the dependent
variables in Table 3 are not intended to be measures of the same variable
as was measured in the NFVR. Rather, the pairs of variables correlated
were chosen on the basis of theoretical assumptions. Consequently,

otth\^Nm^\"°ableJ'.^°^^'^^ ' °^ ^'^-^"^"^ ^^^^^ 

Effect of Sample Size. The first of the two columns of validity
coefficients in Table 3 uses all 50 states and the
District of Columbia, whereas the second of the two columns uses only the
36 states represented by 100 or more respondents. Comparison of the two
columns of correlations in Table 3 shows that, without exception, the
correlations in the second column are higher than those in the first
column. Since the second column was computed using the 36 states
with at least 100 respondents, this would be an unremarkable
finding if it were not for the fact that it is inconsistent with the
findings in Table 2, which show only small differences
according to which set of states is used. Since no explanation has been
developed for the Table 2 findings, and since those based on Table 3 seem
to be more plausible, it seems safest to conclude that the most valid way
to analyze these data may be to restrict the analysis to the 36 states
which are represented by at least 100 respondents.

Homicide. Part A of Table 3 shows the relationship of three
state-level variables based on the NFVR to the state homicide rate. The findings
are in the expected direction.
Specifically, the first row shows that the larger the proportion of the
population of a state who regard it as permissible for a husband to hit
his wife, the higher the homicide rate. The second row shows that the
higher the score of men on a scale intended to measure overt aggressive
acts, the higher the homicide rate. The third row shows that the higher
the percent of the population who are black, the higher the homicide rate.
This correlation is particularly strong, which is consistent with
the fact that the homicide rate among the black population is several
times that of the white population (Curtis, 1975; Plass and Straus, 1987).

Alcoholism. Part B of Table 3 follows the same logic for state-level
variables based on self-reported alcohol use. All eight of the
self-report based measures of drinking were found to be correlated with
estimates of alcoholism based on the death rate for cirrhosis of the liver.

Income. Part C of Table 3 shows evidence of the validity of state
estimates based on the income question in the NFV Resurvey.
These correlations are lower than the validity correlations for
income in Table 2 because
the dependent variable is the percentage of
children living in poverty, not the mean or median family income.

Stress and Psychological Problems. Part D of Table 3 examines the
validity of three measures of the psychological well-being of the population
in each state. The question is whether one can use the results of this

sSessfu^Ti^ even?..\^'°" higher the rate of 

hiri - -V-^^^^ a^-^linf sar:LtdV^^^^^^^^^^^ 

psychosomatic complaints (such as headaches, cold sweats).



THE GENERAL SOCIAL SURVEY 

This section presents the results of validity analyses using the
cumulative data file for the General Social Survey (GSS). The extremely
n Se"sl Lde'it^°'".i?'°°° '^^^^^ wideUgeif tfpicrco'S 

GSS su^eys used Tr thLf '°T'' """'""^ coefficients. The 

individual-level files. *3 ^ °^ ^^'^^^ '^^"^ the 

only ^rftftes'^Ld^'he^N"""" "^P-^-ts were drawn from 

7 o/ ! f ^"'^ ^"^^ ^ P^^ state ranges from lows of 16 for Mississ^nn^ 

CaUfornS' ^Tet are^' ^ ''^'^ °' ^'^^^ Yo^k':L"\"583 1o 

California. There are 34 states represented by 50 or more respondents.



Concurrent Validity 



(Table 4 about here) 



Thirty-eight of the individual-level variables from the GSS were used
to create state-level variables which correspond, at least partly, to
census variables. Correlations between these 38 GSS variables and 60
census variables are presented in Table 4.

Effect of Sample Size. The first column of Table 4 gives the concurrent
validity coefficients when all 41 states in the GSS are used, including
seven states represented by less than 50 respondents. The middle column
excludes those seven states, and the column on
the far right uses only the 23 states that were represented by 200 or more
respondents. The bottom row of Table 4 permits a test of the hypothesis
that the larger the number of cases used to create a state-level variable,
the greater the validity of the variable. It gives the average validity coefficient
for these three "samples." Surprisingly, there is almost no gain in the
average validity coefficient when the seven states represented by less
than 50 respondents are excluded (r = .34 and .36). Moreover, there is
only a moderate gain in validity when states with less than 200
respondents are excluded (r = .45). This replicates the findings from the
analysis of Table 2 using the National Family Violence Resurvey.

Validity of Specific Variables. Another puzzle is why the validity
coefficients vary so greatly from variable to variable. For example, the
rows in Part A of Table 4 for variable GS1T1 show a high level of
agreement with the census measure of Civilian Labor Force
(CLF) participation rate, and the rows in Part C for variable GS3T3 show
a high level of agreement with the census measure of divorce. On the
other hand, the coefficients in Part D of Table 4 are low, some of them
near zero.

Comparison with Family Violence Survey. An important finding of this
section comes from comparing the average validity coefficients for the
GSS variables (last row of Table 4) with the average validity coefficient
for the 1985 National Family Violence Survey (last row of Table 2).
Consistent with the hypothesis posed earlier, the validity of state-level
variables derived from the GSS is substantially lower, even though the GSS
sample is many times larger than the 1985 family survey sample. It seems
plausible to attribute the lower validity of the GSS state-level variables
to the fact that the GSS sample was not designed to be representative of
each state in the sample. The other side of the coin, however, is also
important: it is remarkable that the concurrent validity coefficients
are as high as they are. Moreover, as will be shown in the next section,
the construct validity of variables computed from the GSS is as high or
higher than validity coefficients found anywhere in the sociological or
psychometric literature.



Construct Validity Of
Multi-Indicator Indexes

The original reason for creating state-level variables from the GSS
was to create measures of constructs needed to test certain theories concerning

support ?or viT °' "'"^^^ ^ " °f 

I J \^°}^''''^' ^""1 ^ nieasure of sexual liberalism or tolerance 

The response categories for the items in each
index and the method of combining indicators to form the composite index
are given in the State and Regional Indicators Archive Codebook (Straus,
1987).

Sexual Liberalism Index. The indicators making up this index are 20
attitude items, each of which is in the form of the percent of respondents
in a state who express a favorable attitude toward abortion, allowing
homosexuals to teach in a college, permitting teenagers to have access to
contraceptives, and favoring sex education in the schools.

Jaffee and Straus (1986) found a correlation between this
index and the circulation rate of sex magazines.
Since these two variables have a very different origin, and one
refers to attitudes and the other to overt behavior (purchase of sexually
explicit materials), this correlation is evidence suggesting both
construct and discriminant validity.

comp ute°lhlc%nH^' ""'l ^1^7- ^^^P°"^^^ ^° 1^ GSS questions were used to 
compute this index. Each item consists of the percentage of respondents 

SnaTty beutvf thaTm "''"^'irK"^"'^'"^ gun'permltsf favor^thTd ^ h 
M^^w' ^^^i^""^ ^^^^ should be spent on the military, and approve of 

lilt^ another person under a variety of different conditions T^f 
Violence Approval Index has an alpha coefficient of reliability of .'e? 

rhis ' l'""'. ^^^^^^ ^ correlation of .40 bctvec 

n ^"'^ ^"'^^'^ designed to measure "legitimate violence" rs^rau^" 
1985b: Baron and Straus, n.d.). The two indexes are intended to LasS^ 
^Jolen^ "ntZl^'''^ " ^^^'^'^^ Permissible and/or ap^r'ed 

^: Tf^he^^^^^ t/rov-r ^ts^-Lt^ t-hrl^ 

magazines. Since the Legitimate Violence Index is based on "objective"
data, in contrast to the Violence Approval Index which is based on GSS
attitude data, the correlation suggests the validity of both measures.
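
For readers unfamiliar with the alpha coefficient of reliability reported for the indexes, the sketch below computes Cronbach's alpha from a small matrix of state-level item scores using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The data and the function name are hypothetical; the items actually used are documented in the codebook cited above, not here.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows, one row per state and one
    column per indicator (all indicators scored in the same direction)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two indicators")
    # Sample variance of each indicator across states.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed index across states.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

# Hypothetical percentages for four states on three attitude items.
scores = [
    [40.0, 35.0, 50.0],
    [55.0, 45.0, 60.0],
    [30.0, 32.0, 41.0],
    [62.0, 58.0, 70.0],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

Alpha rises toward 1.0 as the indicators vary together across states, which is the sense in which a high alpha supports combining them into a single index.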

Baron, Straus, and Jaffee's test of the "cultural snmov=>-" 

They hypothesized that the higher the support for culturally permissive
violence, the higher the rate of criminal violence. This hypothesis was

t^e^GSS fnd%hf L°\'ti ^ ' v/T" "^P^""^' '"'^^'^ ^^'"P"^^'^ on^he ba is ol 
such as site l^^^^'^Z '"'^"'^ "°'"P"'"'^ P"''!^- data 

Guard. ^"'^ expenditures per capita on the National 

SUMMARY AND CONCLUSIONS 

Previous research has provided evidence of the validity of using the
states of the United States as units of society for macro-sociological
research (Straus, 1985a). However, a practical limitation arises despite
the large amount of census and other public record data on the states,
because many important issues can only be investigated if it is possible
to aggregate individual-level survey data as a means of measuring the
properties of states.

Aggregate-level variables are frequently created from national surveys
for cross-national comparative research but have not been used for
cross-state comparative research. Some of the obstacles and problems
connected with using survey data to measure properties of states are
discussed, including an inadequate number of cases per state in a typical
national survey, the fact that national surveys usually include respondents
from only about 30 states, and the fact that the sample for most national
surveys is not designed to be representative of each state where
respondents are interviewed.

The main part of the paper reports empirical studies of the validity
of state-level variables created by aggregating individual-level surveys.
The empirical analyses use three large and well proven surveys to create
state-level variables. The procedure was to compute the percent of
respondents in each state who expressed a certain opinion, or who reported
having certain characteristics. Two types of state-level variables were
computed: variables needed for substantive analyses, such as the rate of
wife-beating, and variables which correspond to census or other public
data. The latter provided a means of estimating the "concurrent validity"
of state-level variables created by aggregating individual responses for
each state. The former provided a means of investigating the "construct
validity" of state-level variables created from individual-level survey
data.


Concurrent and Construct Validity

Almost all the concurrent validity coefficients (correlations of
state-level variables created by aggregating individual-level survey data
with measures based on census and other public record data) exceed the
typical validity coefficient reported in the psychometric
literature by a considerable margin. For the survey best

^aslound llTe '''''''''' ^"^"^^ ""^'''''^ coefficient 

The analyses of construct validity provide considerable evidence
supporting the idea that conceptually valid variables can be created for
the states of the United States by aggregating individual-level survey
data. The construct validity of variables based on the General Social
Survey is particularly encouraging because this is such a widely used data
set and because the variables measure so many key aspects of American
society.

In general, the size of both the concurrent validity and the 
construct validity coefficients is remarkable because the samples for two 
of the three surveys used to create the state-level variables were not 
drawn by methods intended to create a valid sample within each of the 
states that happened to fall in the survey, and because for each of the
three surveys a substantial number of states were represented by 25 or
fewer respondents.



Conclusions 

The discussion up to this point has tended to interpret the results
as indicating a surprisingly high level of validity for state-level
variables computed by aggregating individual-level survey data. However,
although the average is higher than was expected, there is considerable
variability. Moreover, even the average needs to be looked at from
another perspective. A concurrent validity coefficient of .50 for

^Sa^oi t:s ttrstr^-^-tr^ t:^^^^^^^ ^^^^^^'::^! l:z 

aware of both the strengths and the limits of any set of measures. The
results reported in this paper are a step in that direction.

What do these findings tell us about the practical issue of whether
to proceed with using variables created from individual-level surveys? On
the one hand, the evidence of concurrent and construct validity is
encouraging. On the other hand, the typical national survey is not
comparable to the surveys behind these findings and has an inadequate
number of cases per state. Given these conflicting considerations there
can be no general recommendation. Perhaps the most prudent approach is to
conclude that aggregated survey data should be avoided unless there are
strong reasons to use such data despite the problems. Among those reasons
are lack of a feasible alternative, the uniqueness and importance of the


FOOTNOTES 



1. The situation is almost the opposite in psychology. Relative to
sociologists, psychologists pay much more attention to the validity of their

:L^;L"isToVcrciai.^"^^^^^^^^ ^'^^^ ~ 

2. Respondents were selected by four methods: a national probability
sample of approximately 4,000, oversamples to increase the number of black
and Hispanic families, and an oversample to increase the number of cases in
selected states to 100 per state. The oversamples have been weighted to enable
them to be combined into a nationally representative sample. The state-level
statistics used in this paper are based on this weighted total
sample.

3. I would like to express my appreciation to James A. Davis and Tom

Lse sLti°:ti'c7\''^? T ^^^^-by-^-^^ ^^atistics used in this paper 

S:::r:h"c:Sr 'fecfuse^ thrGs's''s''\"^" °' °P^'^°" 
Center because the GSS sample is not designed to be
representative within each of the states. Consequently the public use
data tape does not include a state identification code.

coeffJcient^of "'II '"'^^'""'^ '° "g^^'^ ^^"^g- validity 
JrnsvchoT.° / indicating the low standard prevailing 

in psychology and a low validity of state-level variables created by
aggregating survey data. That may well be true, but it does not follow
coSf.T : °^ ^^^^ is a low average vaUdity 

bLaSe there f 'b' T^'. --^-^-s are almost surely not warranted 
because there is no basis for comparison. Previous analyses reveal that
sociologists rarely report validity coefficients for their measures
(Blalock, 1979; Straus, 1964; Straus and Brown, 1978). That situation
continues to this day. In the 1986 volume of the American Sociological
Review, not one investigator reported a validity coefficient.
Table 1. Concurrent Validity Coefficients (Correlation
of Survey Estimates with Census Data) For Five Variables,
1975 National Family Violence Survey

Survey Variable &                         All 36     30 States
Census Variable                           States     With N = 20+

Median income of husbands                  .48         .56
  by Median income of males

% Husbands employed full time              .13         .24
  by % Males employed full time

% Husbands completed high school           .56         .76
  by % Males 17+ who completed HS

% Wives who completed high school          .68         .77
  by % Females who completed HS

MEAN VALIDITY COEFFICIENT                  .46         .53



Table 2. Concurrent Validity Coefficients (r) by Minimum Number of Cases
Per State, 1985 National Family Violence Resurvey

                         Minimum Number of Cases Per State
                         36 states     39 states     51 states     Mean
Variable                 with N >100   with N >25    with N >7     Correlation

% Age 65                    .44           .48           .49           .47
% Black                     .95           .87           .85           .89
% Hispanic                  .95           .94           .94           .94
Median Income               .76           .77           .68           .74
Mean Income                 .81           .82           .74           .79

MEAN VALIDITY COEFF.        .78           .77           .74           .77
Table 3. Construct Validity Of 1985 National Family Violence Survey Variables
(correlation with public record state-level variables), for all states and for 36
states with 100+ respondents

                                     Public Record                       All 51    36 states
NFV Resurvey Variable                State Level Variable                States    with N>100

A. HOMICIDE RATE

Approve of H slapping wife under     Homicide rate per 100k pop            .34       .37
  some circumstances (vb49pl)          1975-80 (vbh4)
Physical Aggression Index of         Homicide rate per 100k pop            .13       .40
  men (vb57hcl)                        1975-80 (vbh4)
% black (vbf5pl)                     Homicide rate per 100k pop            .69       .80
                                       1975-80 (vbh4)

B. ALCOHOLISM RATE

% Never drink (vb65apl)              Alcoholism rate, males 1977 (vll2)   -.32       .57
                                     Alcoholism rate, females 1977 (vll3) -.38       .58
% drinking 3+ days per week          Alcoholism rate, males 1977           .20       .49
  (vb65ap2)                          Alcoholism rate, females 1977         .28       .53
% husbands drunk 2+ times in         Alcoholism rate, males 1977           .36       .49
  last year (vb66ah2)                Alcoholism rate, females 1977         .22       .42
% wives drunk 2+ times in            Alcoholism rate, males 1977           .19       .57
  last year (vb67aw2)                Alcoholism rate, females 1977         .32       .58

C. POVERTY RATE

Median family income (vbfSml)        % of children in families with        .27       .48
                                       below poverty $ 19?? (cbl39rx)
% with income below $10,000          % of children in families with        .39       .58
                                       below poverty $ 19?? (cbl39rx)

D. STATE STRESS INDEX

Subjective Stress Index (vb63yp2)    State Stress Index, 1976 (txl5)       .03       .20
Depression Index (vb63xp2)           State Stress Index, 1976 (txl5)       .29       .30
Psychosomatic Complaints Index       State Stress Index, 1976 (txl5)       .01       .42
  (vbeSzpl)
Table 4. Concurrent Validity Coefficients (Correlations Between GSS-Based Variables and
Nearest Equivalent Census Variable) for Three Sub-sets of States



GSS and Census Variables 



                                          All        States with    States with
                                          States     50+ cases      200+ cases
                                          N=41       N=34           N=23



A. Employment 

GS1T4 Percent Laidoff with 

Census CLF Unemployed 1970 
" " 1976 
1978 



GSITI Percent Employed Full Time with: 

Census Percent of Pop in the CLE 1970 

" Employed 1970 
" in the CLF 1978 



tt 
tt 



GS1T7 Percent Keep House with: 

Census Percent of Female 17 & Work 1975 

" 17 & Work 

Looked, 1975 

B. Occupation 

GS2T0 Percent Prof -Tech A with: 

Census Percent of Prof .Manag. 1970 

Tech. Kindred 1976 

GS2T1 Percent Prof -Tech B with: 

Census Percent of Prof. Manag. 1970 

Tech. Kindred 1976 

GS2T2 Percent Manag. -Admn. -Sales with: 

Census Percent of Prof. Manag. 1970 
Manag. Admn. 1976 
Sales 1976 

GS2T3 Percent Clerical with: 

Census Percent Sales -Clerical 1970 

GS2T4 Percent Craft with: 

Census Percent of Crafts Foremen 1970 

Kindred 1976 

GS2T5 Percen^^ Operatives with: 

Census Percent of Operatives-Trnsprt 1976 

GS2T6 Percent Transport-Labor with: 

Census Percent of Operatives-Trnsprt 1976 
" Laborers non-Farm 1976 



GS2T9 Percent Service with: 

Census Percent of Empl. Service 1976 
.27


.38* 


.42* 


.29* 


.27 


.41* 


.15 


.13 


.28 


. 56*** 


. 64*** 


. 55** 


. 59*** 


. 69*** 


.58** 


.49*** 


.46** 


.48** 


-.45** 


- . 41** 


-.32 


-.45** 


- .43** 


-.39* 


.07 


.23 


.26 


.21 


.38* 


.38* 


.41** 


.41** 


. 53** 


.41** 


.32* 


. 52** 


.28* 


.27 


.45* 


.08 


.20 


.44* 


.23 


.22 


. 50** 


.15 


.49** 


. 61*** 


.20 


.18 


. 53** 


.26 


.23 


. 55** 


.24 


-.32* 


-.05 


.69** 


. 75*** 


. 66** 


.18 


.15 


.42* 


.19 


.26 


.49** 




Table 4. (Continued) Concurrent Validity Coefficients for GSS State-Level Variables



or-oo., „ C. Marita l Status 

GS3T1 Percent Married with: 

Census Percent 14 & Married 1976 

1970 



.18 
.14 



GS3T2 Percent Widowed with: 

Census Percent 14 & Widowed 1976 

1970 

GS3T3 Percent Divorces with: 

Census Percent 14 & Divorces 1976 

1970 



.32* 
.37** 



, 53*** 
. 50*** 



GS3T5 Per( 
Cent) 



•t Never Married with: 
'ercent 14 & Never Harried 1976 
Single 1970 



.35* 
.31* 



oo/a,« . 0. Children 

GS4T0 Percent No Children with: 

Census Percent Families No Children <18 1976 .18 



GS4T1 Percent One Child with: 

Census Percent One Child <18 1976 

GS4T2 Percent Two Children with: 

Census Percent Two Children <18 1976 

GS4T3 Percent Three Children with: 

Census Percent Three Children <18 1976 



GS5T2 Percent 20-29 Years with: 
Census Percent 20-24 1976 
25-34 1976 

GS5T3 Percent 30-39 Years with: 
Census Percent 25-34 1976 
" " 35-44 1976 

GS5T4 Percent 40-49 Years with: 
Census Percent 35-44 1976 
45-54 1976 

GS5T5 Percent 50-59 Years with: 
Census Percent 45-54 1976 
55-64 1976 

GS5T6 Percent 60-69 Years with: 
Census Percent 55-64 1976 
65+ 1976 

GS5T7 Percent 70-79 Years with: 
Census Percent 65+ 1976 



E, Age 



2 ) 



.21 



.24 



.42** 



. 51*** 
.39** 



.09 
.14 



•.06 
.15 



.22 
.10 



.33* 
.31* 



,31* 



.06 
.00 



.05 
.07 



.58*** 
. 64*** 



.37* 
,32* 



.19 



.28 



.08 



. 52*** 



. 63*** 
. 50** 



-.02 
-.02 



.11 
.22 



.23 
.23 



.33* 
.24 



.32* 



.52** 
.35** 



.47* 
,50** 



, 78*** 
, 81*** 



.44* 
.31 



.17 



.35 



.08 



.21 



.60** 
. 64*** 



.18 
.18 



,16 

.20 



,32 
.31 



.29 
.35 



,37* 
Table 4. (Continued) Concurrent Validity Coefficients for GSS State-Level Variables



GS5T8 Percent 80+ Years with: 

Census Percent 65+ 1976 .22 

^n^n,n ^ F, EdUCatlon 

GS6T1 Percent 0-7 Years Education with: 

Census Percent 0-4 Years Education 1976 75*** 
5-8 " " « ,46** 

GS 6T2 Percent 8 Years Education with: 

Census Percent 5-8 Years Education 1976 .56*** 

GS6T3 Percent 9-11 Years Education with: 

Census Percent 1-3 Years H.S. 1976 .51*** 

GS6T4 Percent 12 Years Education with: 

Census Percent 4 Years H.S. 1976 .51*** 

GS6T5 Percent 1-3 Years College with: 

Census Percent 1-3 Years College 1976 58*** 
1970 .55*** 

GS6T6 Percent 4 Year College with: 

Census Percent 4+ Years College 1976 .46** 

GS6T7 Percent 5-8 Years College with: 

Census Percent 4+ Years College 1970 



.29* 



.47* 



GS7T2 Percent Female with: 

Census Males per 100 Females 1970 



.33* 

G« Gender and Race 
-.08 



GS8T2 Percent Black with: 

Census Percent Black 18+ 1976 

All Ages 1976 



. 64*** 
.66*** 



H. Household Composition 
GS9T1 Percent in 1-Person H.H. 's with: 



Census Percent H.H. 1-Person 1976 

GS9T2 Percent in 2 -Person H.H. 's with: 

Census Percent Families 2 -Member 1976 

GS9T3 Percent in 3-Person H.H.'s with: 

Census Percent Families 3-Member 1976 



.22 
.18 
.04 



,81*** .79*** 
,57*** .50** 



.44** 



.56*** 



, 74*** 



,72*** 
, 67*** 



.46** 



.44** 



.08 



, 65*** 
. 67*** 



.31* 



.22 



.40** 



.49** 



, 64*** 



,91*** 



.71*** 
.66*** 



.48** 



.57** 



.14 



.77*** 
.76*** 



.43* 



.50** 



.28 



MEAN CORRELATION COEFFICIENT              .34        .36            .45



* = p <.05, ** = p <.01, *** = p <.001. CLF = Civilian Labor Force